2023-09-05
Data-centric AI is an emerging concept that emphasizes the importance of data quality and data engineering in building AI systems. Data-centric AI aims to improve the performance and robustness of AI models by systematically characterizing, evaluating, and monitoring the underlying data used to train and evaluate them⁴. Data-centric AI also involves using data-driven methods and tools to inform the considerations at each stage of the ML pipeline⁴.
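As a rough illustration of what "systematically characterizing" training data can look like, the sketch below audits a small in-memory dataset for missing fields, exact duplicates, and label balance. The function name `audit_dataset`, the record layout, and the sample rows are assumptions made for this example; they do not come from any of the cited sources.

```python
from collections import Counter


def audit_dataset(rows, label_key="label"):
    """Summarize basic data-quality signals for a list of dict records (hypothetical helper)."""
    # Records with at least one missing (None or empty-string) field.
    rows_with_missing = sum(
        1 for r in rows if any(v is None or v == "" for v in r.values())
    )
    # Exact duplicates: records whose keys and values all match an earlier record.
    seen = Counter(tuple(sorted(r.items())) for r in rows)
    duplicate_rows = sum(count - 1 for count in seen.values())
    # Label distribution, skipping records whose label is missing.
    label_counts = Counter(
        r[label_key] for r in rows if r.get(label_key) not in (None, "")
    )
    return {
        "total": len(rows),
        "rows_with_missing": rows_with_missing,
        "duplicate_rows": duplicate_rows,
        "label_counts": dict(label_counts),
    }


# Made-up sample data, for illustration only.
rows = [
    {"text": "good movie", "label": "pos"},
    {"text": "good movie", "label": "pos"},
    {"text": "bad movie", "label": "neg"},
    {"text": "", "label": "neg"},
    {"text": "fine", "label": None},
]
report = audit_dataset(rows)
print(report)
```

Running a report like this on every new version of a dataset is one simple way to monitor data over time; real projects would likely add schema and distribution checks on top.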
One tool that can help with data-centric AI is Data Version Control (DVC), a system for versioning machine learning models, data sets, and intermediate files. DVC links these artifacts to the code that produced them and can store file contents in a variety of storage backends³. DVC lets users track and reproduce experiments, share data and models, and collaborate effectively on AI projects³.
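The idea of versioning data alongside code can be sketched without DVC itself. The example below is a minimal, stdlib-only illustration of the general pattern: file contents go into a content-addressed cache, and a small pointer file (which could be committed to Git) records which version is in use. The helper names `track` and `checkout`, the `.ptr.json` suffix, and the cache layout are all hypothetical and are not DVC's actual API or file format.

```python
import hashlib
import json
import shutil
import tempfile
from pathlib import Path


def track(data_path: Path, cache_dir: Path) -> Path:
    """Copy a file into a content-addressed cache and write a small pointer file."""
    digest = hashlib.sha256(data_path.read_bytes()).hexdigest()
    cached = cache_dir / digest[:2] / digest[2:]
    cached.parent.mkdir(parents=True, exist_ok=True)
    if not cached.exists():
        shutil.copy2(data_path, cached)
    # The pointer is tiny, so it can live in the code repository.
    pointer = data_path.with_name(data_path.name + ".ptr.json")
    pointer.write_text(json.dumps({"sha256": digest, "path": data_path.name}))
    return pointer


def checkout(pointer: Path, cache_dir: Path) -> Path:
    """Restore the file version that a pointer file refers to."""
    meta = json.loads(pointer.read_text())
    digest = meta["sha256"]
    target = pointer.with_name(meta["path"])
    shutil.copy2(cache_dir / digest[:2] / digest[2:], target)
    return target


with tempfile.TemporaryDirectory() as tmp:
    workdir = Path(tmp)
    cache = workdir / "cache"
    data = workdir / "train.csv"

    data.write_text("x,y\n1,2\n")
    pointer_v1 = track(data, cache).read_text()  # snapshot of version 1

    data.write_text("x,y\n1,2\n3,4\n")
    pointer = track(data, cache)  # version 2 overwrites the pointer

    # Simulate checking out the old pointer from version control.
    pointer.write_text(pointer_v1)
    restored = checkout(pointer, cache).read_text()
    print(restored)
```

Because each pointer names its data by content hash, rolling the pointer back in version control is enough to recover the matching data, which is the property that makes experiments reproducible.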
If you want to learn more about data-centric AI, you can check out some of the web search results I found for you. For example, you can read a survey paper that discusses the necessity, goals, methods, challenges, and benchmarks of data-centric AI¹. You can also watch a video lecture by Andrew Ng, who popularized the term data-centric AI⁶. Or you can explore a website that provides a checklist and resources for applying data-centric AI in practice⁴. I hope this helps you understand what data-centric AI is and how DVC can be useful for it. 😊
Source: Conversation with Bing, 9/14/2023
Touvron, Hugo, Thibaut Lavril, Gautier Izacard, Xavier Martinet, Marie-Anne Lachaux, Timothée Lacroix, Baptiste Rozière, et al. “LLaMA: Open and Efficient Foundation Language Models.” arXiv, February 27, 2023. https://doi.org/10.48550/arXiv.2302.13971.
Sparks of AGI: Early Experiments with GPT-4, 2023.
Gunasekar, Suriya, Yi Zhang, Jyoti Aneja, Caio César Teodoro Mendes, Allie Del Giorno, Sivakanth Gopi, Mojan Javaheripi, et al. “Textbooks Are All You Need.” arXiv, June 20, 2023.
Y. Li, S. Bubeck, R. Eldan, A. Del Giorno, S. Gunasekar, and Y. T. Lee, “Textbooks Are All You Need II: phi-1.5 technical report.” arXiv, Sep. 11, 2023. Accessed: Sep. 12, 2023. [Online]. Available: http://arxiv.org/abs/2309.05463